
Analysis of genomic variation in non-coding elements using population-scale sequencing data from the 1000 Genomes Project

Abstract

In the human genome, it has been estimated that considerably more sequence is under natural selection in non-coding regions [such as transcription-factor binding sites (TF-binding sites) and non-coding RNAs (ncRNAs)] compared to protein-coding ones. However, less attention has been paid to them. To study selective pressure on non-coding elements, we use next-generation sequencing data from the recently completed pilot phase of the 1000 Genomes Project, which, compared to traditional methods, allows for the characterization of a full spectrum of genomic variations, including single-nucleotide polymorphisms (SNPs), short insertions and deletions (indels) and structural variations (SVs). We develop a framework for combining these variation data with non-coding elements, calculating various population-based metrics to compare classes and subclasses of elements, and developing element-aware aggregation procedures to probe the internal structure of an element. Overall, we find that TF-binding sites and ncRNAs are less selectively constrained for SNPs than coding sequences (CDSs), but more constrained than a neutral reference. We also determine that the relative amounts of constraint for the three types of variations are, in general, correlated, but there are some differences: counter-intuitively, TF-binding sites and ncRNAs are more selectively constrained for indels than for SNPs, compared to CDSs. After inspecting the overall properties of a class of elements, we analyze selective pressure on subclasses within an element class, and show that the extent of selection is associated with the genomic properties of each subclass. We find, for instance, that ncRNAs with higher expression levels tend to be under stronger purifying selection, and the actual regions of TF-binding motifs are under stronger selective pressure than the corresponding peak regions. 
Further, we develop element-aware aggregation plots to analyze selective pressure across the linear structure of an element, with the confidence intervals evaluated using both simple bootstrapping and block bootstrapping techniques. We find, for example, that both micro-RNAs (particularly the seed regions) and their binding targets are under stronger selective pressure for SNPs than their immediate genomic surroundings. In addition, we demonstrate that substitutions in TF-binding motifs inversely correlate with site conservation, and SNPs unfavorable for motifs are under more selective constraints than favorable SNPs. Finally, to further investigate intra-element differences, we show that SVs have the tendency to use distinctive modes and mechanisms when they interact with genomic elements, such as enveloping whole gene(s) rather than disrupting them partially, as well as duplicating TF motifs in tandem.
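The abstract states that confidence intervals for the aggregation plots were evaluated with both simple and block bootstrapping, but it does not describe the procedure. The following is a minimal sketch of the two general techniques, not the authors' implementation. It assumes a 1-D array of per-position values (for example, SNP density at one aligned offset across many elements, ordered along the genome) and uses a moving-block scheme. The function names, the block size, and the percentile-interval choice are illustrative assumptions.

```python
import numpy as np


def simple_bootstrap_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI for the mean, resampling observations independently."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Each row holds the indices of one resample drawn with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    means = values[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


def block_bootstrap_ci(values, block_size, n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI for the mean using a moving-block bootstrap.

    Contiguous blocks are resampled so that correlation between
    neighbouring observations is preserved within each block.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    n = len(values)
    if not 1 <= block_size <= n:
        raise ValueError("block_size must be between 1 and len(values)")
    n_blocks = -(-n // block_size)  # ceiling division
    # Random start position for every block of every resample.
    starts = rng.integers(0, n - block_size + 1, size=(n_boot, n_blocks))
    offsets = np.arange(block_size)
    # Expand starts into full index runs, then trim each resample to length n.
    idx = (starts[:, :, None] + offsets).reshape(n_boot, -1)[:, :n]
    means = values[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


if __name__ == "__main__":
    # Toy data only: a smoothed random series, so neighbours are correlated.
    rng = np.random.default_rng(42)
    toy = np.convolve(rng.normal(size=520), np.ones(20) / 20, mode="valid")
    print("simple:", simple_bootstrap_ci(toy))
    print("block: ", block_bootstrap_ci(toy, block_size=25))
```

When neighbouring observations are correlated, the simple bootstrap treats them as independent and tends to give intervals that are too narrow; resampling contiguous blocks is one common way to account for that dependence.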